
    Fast multi-exposure image fusion with median filter and recursive filter


    Hyperspectral Remote Sensing Benchmark Database for Oil Spill Detection with an Isolation Forest-Guided Unsupervised Detector

    Oil spill detection has attracted increasing attention in recent years, since marine oil spill accidents severely affect the environment, natural resources, and the lives of coastal inhabitants. Hyperspectral remote sensing images provide rich spectral information that is beneficial for monitoring oil spills in complex ocean scenarios. However, most existing approaches detect oil spills from hyperspectral images (HSIs) within supervised or semi-supervised frameworks, which require substantial effort to annotate high-quality training sets. In this study, we make the first attempt to develop an unsupervised oil spill detection method for HSIs based on the isolation forest. First, considering that the noise level varies among bands, a noise variance estimation method is exploited to evaluate the noise level of each band, and bands corrupted by severe noise are removed. Second, kernel principal component analysis (KPCA) is employed to reduce the high dimensionality of the HSIs. Then, the probability that each pixel belongs to the seawater or oil spill class is estimated with the isolation forest, and a set of pseudo-labeled training samples is produced automatically by clustering the detected probabilities. Finally, an initial detection map is obtained by applying a support vector machine (SVM) to the dimension-reduced data, and this initial result is further optimized with the extended random walker (ERW) model to improve the detection accuracy of oil spills. Experiments on the airborne hyperspectral oil spill dataset (HOSD) that we created demonstrate that the proposed method obtains superior detection performance with respect to other state-of-the-art detection approaches.
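
    A minimal sketch of the pipeline described above, using scikit-learn stand-ins. The per-band variance proxy for the paper's noise estimator, the number of KPCA components, and the SVM settings are illustrative assumptions, and the ERW spatial refinement step is omitted.

```python
import numpy as np
from sklearn.decomposition import KernelPCA
from sklearn.ensemble import IsolationForest
from sklearn.cluster import KMeans
from sklearn.svm import SVC

def detect_oil_spill(hsi, band_keep_quantile=0.9, n_components=10):
    """hsi: (H, W, B) hyperspectral cube; returns an initial (H, W) label map."""
    h, w, b = hsi.shape
    pixels = hsi.reshape(-1, b)

    # 1. Drop the noisiest bands (simple per-band variance used here as a
    #    crude proxy for the paper's noise-variance estimator).
    band_noise = pixels.var(axis=0)
    keep = band_noise <= np.quantile(band_noise, band_keep_quantile)
    pixels = pixels[:, keep]

    # 2. Kernel PCA for dimensionality reduction.
    feats = KernelPCA(n_components=n_components, kernel="rbf").fit_transform(pixels)

    # 3. Isolation forest scores: oil-spill pixels behave as anomalies
    #    relative to the dominant seawater background.
    scores = IsolationForest(random_state=0).fit(feats).score_samples(feats)

    # 4. Cluster the scores into two groups to produce pseudo-labels
    #    (cluster ids are arbitrary; which one is "oil" must be inspected).
    pseudo = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(
        scores.reshape(-1, 1))

    # 5. Train an SVM on the dimension-reduced features with the pseudo-labels
    #    and predict an initial detection map for every pixel.
    svm = SVC(kernel="rbf").fit(feats, pseudo)
    return svm.predict(feats).reshape(h, w)
```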

    An Information-theoretic analysis of generative adversarial networks for image restoration in physics-based vision

    Image restoration tasks in physics-based vision (such as image denoising, dehazing, and deraining) are fundamental in computer vision and are of great significance both for the processing of visual data and for subsequent applications in different fields. Existing methods mainly focus on exploring the physical properties and mechanisms of the imaging process and tend to use a deconstructive approach to describe how visual degradations (such as noise, haze, and rain) are combined with the background scene. This approach, however, relies heavily on manually engineered features and handcrafted composition models, which may hold only under ideal conditions or rest on hypothetical models that involve human bias or fail to simulate real situations in practice. With the progress of representation learning, generative methods, especially generative adversarial networks (GANs), are considered a more promising solution for image restoration tasks. They directly learn restoration as an end-to-end generation process from large amounts of data without modeling the physical mechanisms, and they can complete missing details and damaged information by incorporating external knowledge, generating plausible results with intelligent-level interpretation and semantic-level understanding of the input images. Nevertheless, existing studies that apply GAN models to image restoration tasks do not achieve satisfactory performance compared with traditional deconstructive methods, and there is scarcely any study or theory that explains how deep generative models work in these tasks. In this study, we analyze the learning dynamics of different deep generative models based on the information bottleneck principle and propose an information-theoretic framework to explain generative methods for image restoration. Within this framework, we study the information flow in image restoration models and identify three sources of information involved in generating the restored results: (i) high-level information extracted by the encoder network, (ii) low-level information from the source inputs that is retained or passed directly through skip connections, and (iii) external information introduced by the learned parameters of the decoder network during the generation process. Based on this theory, we argue that conventional GAN models may not be directly applicable to image restoration tasks, and we identify three key issues leading to their performance gaps: (i) over-invested abstraction processes, (ii) inherent loss of details, and (iii) imbalanced optimization with vanishing gradients. We formulate these problems with corresponding theoretical analyses and provide empirical evidence to verify our hypotheses and demonstrate the existence of each problem. To address them, we propose solutions and suggestions, including optimizing the network structure, enhancing detail extraction and accumulation with dedicated network modules, and replacing the measures used as training objectives, to improve the performance of GAN models on image restoration tasks. Finally, we verify our solutions on benchmark datasets and achieve significant improvements over the baseline models.
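
    The three information pathways identified above can be made concrete with a small encoder-decoder restoration generator. This PyTorch sketch is illustrative only; the layer sizes and the single skip connection are assumptions, not the paper's architecture.

```python
import torch
import torch.nn as nn

class RestorationGenerator(nn.Module):
    def __init__(self):
        super().__init__()
        # (i) high-level information: abstracted by the encoder
        self.enc1 = nn.Sequential(nn.Conv2d(3, 64, 4, 2, 1), nn.ReLU())
        self.enc2 = nn.Sequential(nn.Conv2d(64, 128, 4, 2, 1), nn.ReLU())
        # (iii) external information: carried by the learned decoder parameters
        self.dec2 = nn.Sequential(nn.ConvTranspose2d(128, 64, 4, 2, 1), nn.ReLU())
        self.dec1 = nn.ConvTranspose2d(128, 3, 4, 2, 1)  # 128 = 64 upsampled + 64 skip

    def forward(self, x):
        e1 = self.enc1(x)
        e2 = self.enc2(e1)
        d2 = self.dec2(e2)
        # (ii) low-level information: retained from the source input and
        # passed directly through the skip connection
        return torch.tanh(self.dec1(torch.cat([d2, e1], dim=1)))
```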

    Biological Inspiration from Dynamic Modeling of Human Hand and Trap-Jaw Ant Mandible Movements


    Robust Normalized Softmax Loss for Deep Metric Learning-Based Characterization of Remote Sensing Images With Label Noise

    Most deep metric learning-based image characterization methods exploit supervised information to model the semantic relations among remote sensing (RS) scenes. Nonetheless, the unprecedented availability of large-scale RS data makes the annotation of such images very challenging, requiring automated supportive processes. Whether the annotation is assisted by aggregation or crowd-sourcing, the RS large-variance problem, together with other important factors (e.g., geo-location/registration errors, land-cover changes, and low-quality Volunteered Geographic Information (VGI)), often introduces so-called label noise, i.e., semantic annotation errors. In this article, we first investigate the deep metric learning-based characterization of RS images with label noise and propose a novel loss formulation, named robust normalized softmax loss (RNSL), for robustly learning the metrics among RS scenes. Specifically, our RNSL improves the robustness of the normalized softmax loss (NSL), commonly utilized for deep metric learning, by replacing its logarithmic function with the negative Box–Cox transformation in order to down-weight the contributions of noisy images to the learning of the corresponding class prototypes. Moreover, by truncating the loss at a certain threshold, we also propose a truncated robust normalized softmax loss (t-RNSL), which further enforces the learning of class prototypes from image features with high mutual similarity, so that intraclass features are well grouped and interclass features are well separated. Our experiments, conducted on two benchmark RS datasets, validate the effectiveness of the proposed approach with respect to different state-of-the-art methods in three downstream applications (classification, clustering, and retrieval). The codes of this article will be publicly available from https://github.com/jiankang1991
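
    A minimal sketch of the RNSL idea as described above: the log in the normalized softmax loss is replaced by the negative Box–Cox transformation (1 - p^lam) / lam, which recovers -log(p) as lam approaches 0, and t-RNSL truncates the per-sample loss. The scale s, the Box–Cox parameter lam, and the truncation threshold are assumed values, not the paper's settings.

```python
import torch
import torch.nn.functional as F

def rnsl(features, prototypes, labels, s=16.0, lam=0.5, trunc=None):
    # Cosine similarities between L2-normalized features and class prototypes.
    logits = s * F.normalize(features) @ F.normalize(prototypes).t()
    # Softmax probability assigned to each sample's own class.
    p = F.softmax(logits, dim=1).gather(1, labels.unsqueeze(1)).squeeze(1)
    # Negative Box-Cox transformation in place of -log(p): noisy samples
    # with small p contribute bounded, down-weighted gradients.
    loss = (1.0 - p.pow(lam)) / lam
    if trunc is not None:
        # t-RNSL: truncating caps the loss of low-similarity (likely noisy)
        # samples, so prototypes are driven by high-similarity images.
        loss = torch.clamp(loss, max=trunc)
    return loss.mean()
```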

    Noise-Tolerant Deep Neighborhood Embedding for Remotely Sensed Images With Label Noise

    Recently, many deep learning-based methods have been developed for solving remote sensing (RS) scene classification and retrieval tasks. Most of the loss functions adopted for training these models require accurate annotations. However, the presence of noise in such annotations (known as label noise) cannot be avoided in large-scale RS benchmark archives, resulting from geo-location/registration errors, land-cover changes, and the diverse knowledge backgrounds of annotators. To overcome the influence of noisy labels on the learning process of deep models, we propose a new loss function, called noise-tolerant deep neighborhood embedding, which can accurately encode the semantic relationships among RS scenes. Specifically, we aim to maximize the leave-one-out K-NN score in order to uncover the inherent neighborhood structure among the images in feature space. Moreover, we down-weight the contribution of potentially noisy images by learning their localized structure and pruning the images with low leave-one-out K-NN scores. Based on our newly proposed loss function, classwise features can be discriminated more robustly. Our experiments, conducted on two benchmark RS datasets, validate the effectiveness of the proposed approach on three different RS scene interpretation tasks: classification, clustering, and retrieval. The codes of this article will be publicly available from https://github.com/jiankang1991
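
    A hedged sketch of the leave-one-out K-NN score and the pruning step described above; the value of k, the cosine similarity measure, and the pruning quantile are illustrative assumptions.

```python
import torch
import torch.nn.functional as F

def loo_knn_scores(features, labels, k=10):
    """Fraction of each sample's k nearest neighbors (itself excluded)
    that share its label, computed in the embedded feature space."""
    f = F.normalize(features)                 # (N, D) -> cosine similarity
    sim = f @ f.t()
    sim.fill_diagonal_(float("-inf"))         # leave-one-out: exclude self
    nn_idx = sim.topk(k, dim=1).indices       # (N, k) neighbor indices
    same = labels[nn_idx] == labels.unsqueeze(1)
    return same.float().mean(dim=1)           # (N,) scores in [0, 1]

def prune_noisy(features, labels, k=10, quantile=0.1):
    # Images whose neighborhoods disagree with their label get low scores
    # and are pruned from the embedding loss as likely label noise.
    scores = loo_knn_scores(features, labels, k)
    return scores > torch.quantile(scores, quantile)  # boolean keep-mask
```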

    Efficient coding schemes for low‐rate wireless personal area networks

    Peer Reviewed. http://deepblue.lib.umich.edu/bitstream/2027.42/166246/1/cmu2bf01608.pd

    Decomposing 3D Neuroimaging into 2+1D Processing for Schizophrenia Recognition

    Deep learning has been successfully applied to recognizing both natural images and medical images. However, there remains a gap in recognizing 3D neuroimaging data, especially for psychiatric diseases such as schizophrenia and depression that show no visible alteration in specific slices. In this study, we propose to process the 3D data with a 2+1D framework so that we can exploit powerful deep 2D Convolutional Neural Networks (CNNs) pre-trained on the large ImageNet dataset for 3D neuroimaging recognition. Specifically, 3D volumes of Magnetic Resonance Imaging (MRI) metrics (grey matter, white matter, and cerebrospinal fluid) are decomposed into 2D slices according to neighboring voxel positions and fed into 2D CNN models pre-trained on ImageNet to extract feature maps from three views (axial, coronal, and sagittal). Global pooling is applied to remove redundant information, as the activation patterns are sparsely distributed over the feature maps. Channel-wise and slice-wise convolutions are proposed to aggregate the contextual information in the third view dimension left unprocessed by the 2D CNN model. Multi-metric and multi-view information is fused for the final prediction. Our approach achieves better cross-validation results than handcrafted feature-based machine learning, a deep-feature approach with a support vector machine (SVM) classifier, and 3D CNN models trained from scratch on the publicly available Northwestern University Schizophrenia Dataset, and the results are replicated on another independent dataset.
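
    An illustrative sketch of the 2+1D scheme for one MRI metric and one view: slices are passed through an ImageNet-pretrained 2D backbone, and a slice-wise 1D convolution aggregates context along the third, unprocessed dimension. The ResNet-18 backbone and layer sizes are assumptions, not the paper's exact configuration.

```python
import torch
import torch.nn as nn
from torchvision.models import resnet18

class TwoPlusOneD(nn.Module):
    def __init__(self, n_classes=2):
        super().__init__()
        backbone = resnet18(weights="IMAGENET1K_V1")
        # Keep everything up to (and including) the global average pool.
        self.features = nn.Sequential(*list(backbone.children())[:-1])
        # Slice-wise 1D convolution aggregates context across slices.
        self.slice_conv = nn.Conv1d(512, 128, kernel_size=3, padding=1)
        self.head = nn.Linear(128, n_classes)

    def forward(self, vol):                   # vol: (B, D, H, W), one MRI metric
        b, d, h, w = vol.shape
        # Decompose the volume into D grayscale slices, replicated to RGB.
        slices = vol.reshape(b * d, 1, h, w).repeat(1, 3, 1, 1)
        feats = self.features(slices).flatten(1)        # (B*D, 512)
        feats = feats.view(b, d, 512).permute(0, 2, 1)  # (B, 512, D)
        agg = self.slice_conv(feats).mean(dim=2)        # pool over slices
        return self.head(agg)
```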

    Brain MRI-based 3D Convolutional Neural Networks for Classification of Schizophrenia and Controls

    Convolutional Neural Networks (CNNs) have been successfully applied to the classification of both natural and medical images but have not yet been applied to differentiating patients with schizophrenia from healthy controls. Given the subtle, mixed, and sparsely distributed brain atrophy patterns of schizophrenia, the capability of automatic feature learning makes CNNs a powerful tool for classifying schizophrenia from controls, as it removes the subjectivity in selecting relevant spatial features. To examine the feasibility of applying CNNs to the classification of schizophrenia and controls based on structural Magnetic Resonance Imaging (MRI), we built 3D CNN models with different architectures and compared their performance with a handcrafted feature-based machine learning approach, in which a support vector machine (SVM) was used as the classifier and Voxel-Based Morphometry (VBM) was used as the feature. 3D CNN models with sequential, inception-module, and residual-module architectures were trained from scratch. The CNN models achieved higher cross-validation accuracy than handcrafted feature-based machine learning and, when tested on an independent dataset, greatly outperformed it. This study underscores the potential of CNNs for identifying patients with schizophrenia using 3D brain MR images and paves the way for imaging-based individual-level diagnosis and prognosis in psychiatric disorders.
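
    A minimal sketch of a sequential 3D CNN of the kind compared above; the channel counts and pooling choices are assumptions (the paper also evaluated inception- and residual-style variants, not shown here).

```python
import torch.nn as nn

class Simple3DCNN(nn.Module):
    def __init__(self, n_classes=2):
        super().__init__()
        self.net = nn.Sequential(
            # Three conv-pool stages over the full 3D volume.
            nn.Conv3d(1, 16, 3, padding=1), nn.BatchNorm3d(16), nn.ReLU(),
            nn.MaxPool3d(2),
            nn.Conv3d(16, 32, 3, padding=1), nn.BatchNorm3d(32), nn.ReLU(),
            nn.MaxPool3d(2),
            nn.Conv3d(32, 64, 3, padding=1), nn.BatchNorm3d(64), nn.ReLU(),
            # Global average pooling handles variable volume sizes.
            nn.AdaptiveAvgPool3d(1), nn.Flatten(),
            nn.Linear(64, n_classes),
        )

    def forward(self, x):   # x: (B, 1, D, H, W) structural MRI volume
        return self.net(x)
```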

    Transient Characteristics and Quantitative Analysis of Electromotive Force for DFIG-based Wind Turbines during Grid Faults
